Pupil Data Quality Report: Data Availability and Analysis Readiness for Chapters 2 & 3

BAP Study - Pupillometry Analysis Pipeline

Author

Mohammad Dastgheib

Published

January 3, 2026 at 04:03 PM

1 Executive Summary

1.1 What I Need From You

Decision Points

Please review and approve the following:

  1. Quality Thresholds: Approve thresholds for Chapter 2 (60%) and Chapter 3 (50% + RT filter)
  2. AUC Requirement for Pupil-Enhanced DDM: Approve whether to require auc_available == TRUE for the pupil-predictor DDM subset (see Section 6.2 for intersection counts)
  3. Sensitivity Analysis: Approve whether to run and report 50% and 70% threshold sensitivity analyses as robustness checks (even if brief)

This report provides a comprehensive overview of the pupil data collected in the BAP (Brain, Aging and Perception) study, with a focus on data quality, availability, and readiness for two dissertation chapters:

  • Chapter 2: Psychometric sensitivity and pupil-indexed arousal coupling analyses
  • Chapter 3: Drift Diffusion Model (DDM) analyses with pupil predictors

1.2 Key Findings

  • Total Participants: 63 participants with pupil data
  • Total Trials Available:
    • Total trials in dataset: 14,586
    • Trials with behavioral data: 12,715 (87.2% behavioral join rate)
    • Trials with both AUC metrics: 50.5% overall
  • Chapter 2 Readiness (Primary threshold: 60% validity):
    • Usable trials: 6,104 (41.8% of total trials)
    • Analysis-ready dataset: 14,586 trials in ch2_triallevel.csv (all trials with gate_pupil_primary flag indicating quality)
  • Chapter 3 Readiness (Primary threshold: 50% validity + behavioral RT filter):
    • Pupil+behavior-ready trials: 7,450 (51.1% of total trials)
    • Analysis-ready dataset: 14,586 trials in ch3_triallevel.csv (all trials with ddm_ready flag indicating quality)

Note: Both analysis-ready datasets contain the same 14,586 trials. They differ only in the quality flags provided (gate_pupil_primary for Chapter 2, ddm_ready for Chapter 3), allowing us to filter to the trials that meet each chapter’s quality criteria.
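The flag-based filtering described above can be sketched in a few lines. This is a minimal illustration with invented rows; the flag names gate_pupil_primary and ddm_ready come from the report, everything else is toy data:

```python
# Minimal sketch of how the two analysis-ready files relate: same trials,
# different quality flags. Rows here are illustrative, not real BAP data.
trials = [
    {"subject": "BAP101", "gate_pupil_primary": True,  "ddm_ready": True},
    {"subject": "BAP101", "gate_pupil_primary": False, "ddm_ready": True},
    {"subject": "BAP102", "gate_pupil_primary": True,  "ddm_ready": False},
    {"subject": "BAP102", "gate_pupil_primary": True,  "ddm_ready": True},
]

# Chapter 2 analysis set: trials passing the 60% quality gate
ch2 = [t for t in trials if t["gate_pupil_primary"]]
# Chapter 3 analysis set: trials passing the 50% gate + RT filter
ch3 = [t for t in trials if t["ddm_ready"]]
```

In practice the same one-line filter would be applied to the 14,586-row ch2_triallevel.csv / ch3_triallevel.csv files rather than an in-memory list.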

  • AUC Availability (Both Total AUC and Cognitive AUC):
    • Overall: 50.5%
    • ADT: 47.2%
    • VDT: 54.0%

1.3 Data by Task

Data Availability by Task (from quick_share_v7)

| Task | Total Trials | Ch2 Ready (60%) | Ch2 % | AUC Ready | AUC % |
|------|--------------|-----------------|-------|-----------|-------|
| ADT  | 6404         | 2900            | 45.3  | 3010      | 47.0  |
| VDT  | 6311         | 3440            | 54.5  | 3406      | 54.0  |

2 Introduction and Research Context

2.1 Study Overview

The BAP study examines how physical effort and cognitive demands interact in older adults using a dual-task paradigm combining handgrip force manipulation with perceptual discrimination tasks. Pupillometry provides a continuous, non-invasive measure of arousal that reflects both physical effort (tonic arousal) and cognitive engagement (phasic arousal).

2.2 Data Collection

  • Tasks: Auditory Discrimination Task (ADT) and Visual Discrimination Task (VDT)
  • Sessions: Scanner sessions 2-3 (session 1/practice excluded)
  • Runs: Up to 5 runs per task per session
  • Sampling Rate: 250 Hz (4 ms per sample)
  • Trial Structure: Each trial includes baseline, squeeze, stimulus presentation, and response periods

2.3 Dissertation Chapters

Chapter 2: Psychometric-Pupil Coupling

  • Tests how trial-wise pupil-indexed arousal (Cognitive AUC) modulates psychometric sensitivity
  • Requires high-quality baseline and cognitive window data (≥60% validity for primary analyses)
  • Uses Total AUC for effort manipulation checks and Cognitive AUC for psychometric coupling
  • Quality gates: baseline_quality >= 0.60 AND cog_quality >= 0.60 (from MATLAB pipeline quality metrics)

Chapter 3: DDM with Pupil Predictors

  • Examines how pupil-indexed arousal influences decision-making processes (drift rate, boundary separation)
  • Requires behavioral RT data (0.2-3.0s) plus moderate-quality pupil data (≥50% validity)
  • Can also run behavior-only DDM models without pupil requirements

3 Overall Data Availability

3.1 Participant-Level Summary

**Per-Participant Summary Statistics:**
**Chapter 2:**
- Median total trials per participant: 270 (IQR: 195-270)
- Median Chapter 2 ready trials per participant: 80
- Median Chapter 2 retention rate per participant: 37.0% (60% quality threshold)
**Chapter 3:**
- Median total trials per participant: 270 (IQR: 195-270)
- Median Chapter 3 ready trials per participant: 115
- Median Chapter 3 retention rate per participant: 55.6% (50% quality threshold + RT filter)
*Note: Chapter 3 has higher retention rates because it uses a more lenient quality threshold (50% vs 60%). This prioritizes sample size for DDM analyses, while Chapter 2 prioritizes high-quality data for psychometric coupling analyses.*

3.2 Task and Condition Breakdown

Data Availability by Task (Chapters 2 & 3)

| Task | N Subjects | Total Trials | Ch2 Ready | Ch2 % | Ch3 Ready | Ch3 % | Mean Trials/Subject |
|------|------------|--------------|-----------|-------|-----------|-------|---------------------|
| ADT  | 57         | 6404         | 2886      | 38.3  | 3546      | 47.1  | 132.1               |
| VDT  | 59         | 6311         | 3218      | 45.6  | 3904      | 55.3  | 119.6               |

Note: Ch2 %, Ch3 %, and Mean Trials/Subject in this table are computed against the full trial totals used in Sections 5.2 and 6.2 (7530 ADT, 7056 VDT), not the behavioral-joined totals shown in the Total Trials column.

4 Data Quality and Threshold Analysis

4.1 Quality Gates and Thresholds

Data quality is assessed using window validity metrics, which measure the proportion of valid (non-missing) pupil samples within critical time windows. All times are relative to squeeze onset (trial onset) = 0.0s.

Quality Metrics Used:

The quality gates are based on two metrics computed by the MATLAB preprocessing pipeline:

  1. baseline_quality: Proportion of valid samples in the baseline period
    • Window: Pre-trial baseline period (typically the last portion of the ITI baseline)
    • Used to ensure sufficient data for baseline correction
  2. cog_quality: Proportion of valid samples in the cognitive period
    • Window: Post-target period (typically from target onset through the response window)
    • Used to ensure sufficient data for cognitive pupil response measurement

Quality Thresholds:

Trials must meet minimum validity thresholds in both the baseline and cognitive windows:

  • Chapter 2 Primary: baseline_quality >= 0.60 AND cog_quality >= 0.60 (60% valid samples required)
  • Chapter 2 Sensitivity: Thresholds at 50% and 70% for robustness checks
  • Chapter 3 DDM: baseline_quality >= 0.50 AND cog_quality >= 0.50 (50% valid samples required), plus RT filter (0.2-3.0s)

Key Timing Reference Points:

  • Squeeze onset (TrialST): 0.0s (reference point for all times)
  • Target stimulus onset: 4.35s (relative to squeeze onset)
  • Response window start: 4.70s (relative to squeeze onset)

Chapter 2 Requirements:

  • Primary analysis: Baseline validity ≥ 60% AND Cognitive validity ≥ 60%
    • Baseline window: -0.5s to 0.0s (relative to squeeze onset) for B0 baseline
    • Cognitive window: Defined by MATLAB pipeline cog_quality metric
  • Sensitivity analyses: Thresholds at 50% and 70% for robustness checks

Chapter 3 Requirements:

  • DDM with pupil: Baseline validity ≥ 50% AND Cognitive validity ≥ 50% AND RT 0.2-3.0s
    • Baseline window: -0.5s to 0.0s (relative to squeeze onset) for B0 baseline
    • Cognitive window: Defined by MATLAB pipeline cog_quality metric
    • RT filter: 0.2s to 3.0s (excludes anticipatory and timeout responses)
  • Behavior-only DDM: No pupil quality requirements (uses all behavioral trials)
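The two gates can be written as simple predicates. This is a hedged sketch: the function names are illustrative, while the thresholds and RT bounds follow the report.

```python
def gate_ch2(baseline_quality, cog_quality, threshold=0.60):
    """Chapter 2 primary gate: both windows must reach 60% validity."""
    return baseline_quality >= threshold and cog_quality >= threshold

def gate_ch3(baseline_quality, cog_quality, rt,
             threshold=0.50, rt_min=0.2, rt_max=3.0):
    """Chapter 3 pupil-DDM gate: 50% validity in both windows plus the RT filter."""
    return (baseline_quality >= threshold
            and cog_quality >= threshold
            and rt_min <= rt <= rt_max)

gate_ch2(0.65, 0.58)            # False: cognitive window falls below 60%
gate_ch3(0.65, 0.58, rt=0.85)   # True: both windows >=50% and RT in range
```

The same trial can thus fail the Chapter 2 gate while passing the Chapter 3 gate, which is exactly why Chapter 3 retention rates run higher.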

4.2 Threshold Sensitivity Analysis

This section examines how data retention changes across different quality thresholds to justify our threshold selections for each chapter.


**Retention Rates by Threshold:**

Threshold Sensitivity: Trial Retention Rates

| Task | Threshold | Total Trials | Trials Passing | Retention Rate |
|------|-----------|--------------|----------------|----------------|
| ADT  | 0.50      | 6404         | 3260           | 50.9%          |
| ADT  | 0.60      | 6404         | 2900           | 45.3%          |
| ADT  | 0.70      | 6404         | 2391           | 37.3%          |
| VDT  | 0.50      | 6311         | 3732           | 59.1%          |
| VDT  | 0.60      | 6311         | 3440           | 54.5%          |
| VDT  | 0.70      | 6311         | 2992           | 47.4%          |

Threshold Selection Justification:

  1. Chapter 2: 60% Threshold (Primary Analysis)
    • Rationale: Psychometric coupling analyses require high-quality pupil data to reliably detect trial-wise relationships between arousal and sensitivity
    • Retention: ~45% for ADT, ~54% for VDT at 60% threshold
    • Quality vs. Sample Size: Balances data quality (sufficient valid samples for reliable baseline correction and cognitive AUC) with sample size (retains substantial proportion of trials)
    • Sensitivity Checks: Analyses will be repeated at 50% and 70% thresholds to assess robustness
  2. Chapter 3: 50% Threshold (DDM with Pupil Predictors)
    • Rationale: DDM analyses benefit from larger sample sizes for stable parameter estimates; moderate-quality pupil data is acceptable when combined with behavioral RT filter
    • Retention: ~51% for ADT, ~59% for VDT at 50% threshold
    • Quality vs. Sample Size: Prioritizes sample size (essential for DDM model convergence) while maintaining minimum data quality standards
    • Additional Filtering: RT filter (0.2-3.0s) ensures only valid behavioral responses are included

Key Observations:

  • Moving from the 50% to the 60% threshold costs roughly 5-8 percentage points of retention
  • Moving from the 60% to the 70% threshold costs roughly 7-8 additional percentage points
  • VDT consistently shows higher retention than ADT across all thresholds (~8-10 points higher)
  • At the 50% threshold, both tasks retain >50% of trials, providing adequate sample sizes
  • At the 60% threshold, VDT retains >50% while ADT is just below 50%, a reasonable balance for high-quality analyses
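The retention sweep in the table above amounts to counting, per threshold, the trials whose two validity metrics both clear it. A minimal sketch with invented quality values:

```python
def retention(quality_pairs, thresholds=(0.50, 0.60, 0.70)):
    """Fraction of trials whose baseline AND cognitive validity reach each threshold.

    quality_pairs: list of (baseline_quality, cog_quality) tuples.
    """
    return {
        th: sum(1 for b, c in quality_pairs if b >= th and c >= th) / len(quality_pairs)
        for th in thresholds
    }

# Illustrative values, not the real BAP data
trials = [(0.90, 0.80), (0.55, 0.52), (0.65, 0.62), (0.45, 0.72)]
retention(trials)  # → {0.5: 0.75, 0.6: 0.5, 0.7: 0.25}
```

Because both metrics must clear the bar jointly, retention can drop faster than either metric's marginal distribution alone would suggest.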

4.3 Literature-Based Justification for Threshold Selection

No Single “Gold Standard” Threshold:

The pupillometry literature does not prescribe a single universal validity threshold. Instead, best practices emphasize (a) explicitly reporting artifact handling and exclusion criteria, and (b) using task- and analysis-dependent thresholds rather than applying one cutoff across all studies (Steinhauer et al. 2022; Kret and Sjak-Shie 2019).

Common Threshold Ranges in the Literature:

  1. Lenient/Modeling-Friendly (~50% valid): Used when sample size is critical for model identifiability. Published examples explicitly use “exclude trials with >50% missing pupil data” (i.e., ≥50% valid) in computational modeling contexts, including studies combining pupillometry with drift-diffusion models (Kolnes et al. 2024).

  2. Moderate (~60% valid): Common for trial-wise analyses where single-trial metrics (like AUC) need sufficient valid samples to be reliable. This threshold balances quality with retention, especially important in older adult populations where data loss may be higher.

  3. Stricter (~70-80% valid): Frequently used for tightly event-locked analyses where precise timecourse shape matters, or when designs allow higher trial counts.

Justification for Chapter 2 (60% Threshold):

Trial-wise psychometric coupling analyses are highly sensitive to measurement noise. Single-trial pupil metrics become unstable when windows are too sparse, and measurement error can attenuate regression slopes (Mathôt 2018). A 60% threshold:

  • Reduces measurement error in trial-wise pupil predictors
  • Ensures sufficient valid samples for reliable baseline correction
  • Prevents highly reconstructed trials from dominating the analysis
  • Balances quality requirements with data retention in older adults

This threshold sits between common “strict” rules (70-80%) and modeling-friendly rules (50%), which is appropriate for trial-level modulation of psychometric functions.

Justification for Chapter 3 (50% Threshold):

Computational models (DDM) benefit from larger sample sizes to stabilize parameter estimates, especially for RT distribution tails and error trials (Murphy et al. 2014). A 50% threshold:

  • Has explicit published precedent: studies combining pupillometry with DDM explicitly use “exclude trials with >50% missing” rules after short-gap reconstruction (Kolnes et al. 2024)
  • Minimizes catastrophic artifacts while avoiding selection bias
  • Is paired with RT filters (0.2-3.0s) to ensure behavioral data quality
  • Aligns with modern preprocessing pipelines that use short-gap interpolation + targeted exclusions (Gee et al. 2020)

Sensitivity Analysis as Best Practice:

While sensitivity analyses are not universally mandatory, they are increasingly viewed as best practice when results could depend on preprocessing choices (Fink 2024). Our planned robustness checks (50%, 60%, 70% thresholds) align with this recommendation and will demonstrate that key effects are stable across threshold choices.

4.4 Additional Sensitivity Analyses


**Participant Retention by Threshold (Minimum 10 trials per participant):**

Participant-Level Sensitivity: How Many Participants Have Sufficient Data?

| Task | N Participants | Retained @ 50% | Retained @ 60% | Retained @ 70% | % @ 50% | % @ 60% | % @ 70% |
|------|----------------|----------------|----------------|----------------|---------|---------|---------|
| ADT  | 57             | 41             | 38             | 36             | 71.9    | 66.7    | 63.2    |
| VDT  | 59             | 48             | 44             | 40             | 81.4    | 74.6    | 67.8    |


**Condition-Specific Retention Rates by Threshold:**

Condition-Specific Sensitivity: Retention by Effort × Difficulty

| Task | Effort | Difficulty | N Trials | Retention @ 50% | Retention @ 60% | Retention @ 70% |
|------|--------|------------|----------|-----------------|-----------------|-----------------|
| ADT  | High   | Easy       | 1292     | 52.8            | 48.0            | 39.1            |
| ADT  | High   | Hard       | 1298     | 51.2            | 44.2            | 37.1            |
| ADT  | Low    | Easy       | 1273     | 50.5            | 44.4            | 36.1            |
| ADT  | Low    | Hard       | 1269     | 48.9            | 43.6            | 36.0            |
| VDT  | High   | Easy       | 1274     | 59.6            | 56.2            | 49.5            |
| VDT  | High   | Hard       | 1289     | 59.0            | 54.8            | 46.6            |
| VDT  | Low    | Easy       | 1214     | 58.7            | 53.7            | 46.5            |
| VDT  | Low    | Hard       | 1255     | 58.7            | 52.5            | 45.9            |

Key Findings from Additional Sensitivity Analyses:

  1. Participant-Level Retention: A majority of participants retain sufficient data (≥10 trials) at every threshold, supporting group-level analyses. At the 60% threshold, 66.7% of ADT participants and 74.6% of VDT participants keep at least 10 usable trials.

  2. Condition-Specific Patterns: Retention rates are relatively stable across effort and difficulty conditions, indicating that threshold choices do not systematically bias results toward specific experimental conditions.

  3. Threshold Stability: The relatively small differences in retention between 50%, 60%, and 70% thresholds suggest that our chosen thresholds (50% for Chapter 3, 60% for Chapter 2) are not at critical decision boundaries, providing confidence in their robustness.

4.5 AUC Availability and Missingness

AUC Availability by Task

| Task | Total Trials | Total AUC N | Cog AUC N | Total AUC % | Cog AUC % | Both AUC % |
|------|--------------|-------------|-----------|-------------|-----------|------------|
| ADT  | 7530         | 4502        | 3555      | 59.8        | 47.2      | 47.2       |
| VDT  | 7056         | 4850        | 3811      | 68.7        | 54.0      | 54.0       |

**Top AUC Missingness Reasons:**

Top Reasons for Missing AUC

| Reason                  | N Trials |
|-------------------------|----------|
| B0_insufficient_samples | 4014     |
| cog_auc_failed          | 1881     |
| b0_insufficient_samples | 1325     |

Note: B0_insufficient_samples and b0_insufficient_samples differ only in capitalization and likely reflect the same failure mode recorded by different pipeline stages.

5 Chapter 2: Data Readiness

5.1 Chapter 2 Requirements

Primary Analysis:

  • Quality Gate: Baseline validity ≥ 60% AND Cognitive validity ≥ 60%
  • Pupil Metrics Needed:
    • Total AUC (for effort manipulation check)
    • Cognitive AUC (for psychometric coupling)
  • Analysis Type: Trial-level mixed-effects models with pupil tertiles

Sensitivity Analyses:

  • Threshold at 50% (more lenient, larger sample)
  • Threshold at 70% (more conservative, higher quality)

5.2 Chapter 2 Data Availability

Chapter 2 Data Availability (60% Threshold)

| Task | N Subjects | Total Trials | Ch2 Ready Trials | AUC Available | Subjects with Data | Retention % | AUC % | Coverage % |
|------|------------|--------------|------------------|---------------|--------------------|-------------|-------|------------|
| ADT  | 57         | 7530         | 2886             | 3555          | 41                 | 38.3        | 47.2  | 71.9       |
| VDT  | 59         | 7056         | 3218             | 3811          | 50                 | 45.6        | 54.0  | 84.7       |

**Summary:**
- Total Chapter 2 ready trials: 6,104
- Overall retention rate: 41.8%
- Subject-task combinations with usable data: 91 / 116
- Trials with AUC available: 7,366

5.3 Per-Participant Chapter 2 Readiness

**Top 10 Participants by Chapter 2 Ready Trials:**

| Subject | Total Trials | Ch2 Ready | Retention % | Has Data |
|---------|--------------|-----------|-------------|----------|
| BAP123  | 300          | 295       | 98.3        | TRUE     |
| BAP127  | 300          | 275       | 91.7        | TRUE     |
| BAP124  | 240          | 235       | 97.9        | TRUE     |
| BAP135  | 270          | 234       | 86.7        | TRUE     |
| BAP133  | 240          | 225       | 93.8        | TRUE     |
| BAP159  | 300          | 222       | 74.0        | TRUE     |
| BAP121  | 300          | 221       | 73.7        | TRUE     |
| BAP104  | 270          | 218       | 80.7        | TRUE     |
| BAP132  | 270          | 217       | 80.4        | TRUE     |
| BAP178  | 300          | 208       | 69.3        | TRUE     |

**Participants with Chapter 2 Data:** 55 / 63

6 Chapter 3: Data Readiness

6.1 Chapter 3 Requirements

DDM with Pupil Predictors:

  • Quality Gate: Baseline validity ≥ 50% AND Cognitive validity ≥ 50%
  • Behavioral Filter: RT between 0.2s and 3.0s (excludes anticipatory and timeout responses)
  • Pupil Metrics: Cognitive AUC as a predictor of drift rate, boundary separation, etc.

Behavior-Only DDM:

  • No pupil requirements: Uses all behavioral trials with valid RT
  • Larger sample size: Can include trials with poor or missing pupil data

6.2 Chapter 3 Data Availability

Chapter 3 Data Availability (50% Threshold + RT Filter)

| Task | N Subjects | Total Trials | DDM Ready | AUC Available | Subjects with Data | DDM % | AUC % | Coverage % |
|------|------------|--------------|-----------|---------------|--------------------|-------|-------|------------|
| ADT  | 57         | 7530         | 3546      | 3555          | 46                 | 47.1  | 47.2  | 80.7       |
| VDT  | 59         | 7056         | 3904      | 3811          | 51                 | 55.3  | 54.0  | 86.4       |

**Summary:**
- Total Chapter 3 ready trials (DDM-ready): 7,450
- Overall DDM retention rate: 51.1%
- Subject-task combinations with usable data: 97 / 116
- Trials with AUC available: 7,366

**For pupil-enhanced DDM, the final analysis set is `ddm_ready & auc_available`:**
- Total trials meeting both criteria: 5,896
  - ADT: 2,737 trials
  - VDT: 3,159 trials
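The intersection amounts to requiring both flags on each trial. A hedged sketch with invented rows (the flag names ddm_ready and auc_available are from the report):

```python
from collections import Counter

# Illustrative rows; in practice these would come from ch3_triallevel.csv
trials = [
    {"task": "ADT", "ddm_ready": True,  "auc_available": True},
    {"task": "ADT", "ddm_ready": True,  "auc_available": False},
    {"task": "VDT", "ddm_ready": False, "auc_available": True},
    {"task": "VDT", "ddm_ready": True,  "auc_available": True},
]

# Pupil-enhanced DDM set: trials passing the DDM gate AND carrying both AUC metrics
per_task = Counter(t["task"] for t in trials
                   if t["ddm_ready"] and t["auc_available"])
```

Applying the same conjunction to the real file is what yields the 2,737 ADT and 3,159 VDT counts above.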

6.3 Behavior-Only vs Pupil-Enhanced DDM

Chapter 3 Analysis Options

| Analysis Type              | Total Trials | N Subjects | Advantage                                    |
|----------------------------|--------------|------------|----------------------------------------------|
| Behavior-Only DDM          | 12715        | 63         | Larger sample, no pupil quality requirements |
| DDM with Pupil Predictors  | 7450         | 97         | Can test arousal effects on decision-making  |

Note: The two subject counts use different bases: 63 unique participants for the behavior-only row versus 97 subject-task combinations (of 116) for the pupil-predictor row (see Section 6.2).

7 Data Quality Issues and Considerations

7.1 Common Data Quality Challenges

  1. Missing Samples: Blinks, eye movements, or tracking loss result in missing pupil diameter measurements
  2. Window Validity: Critical windows (baseline, cognitive) must have sufficient valid samples for reliable metrics
  3. Behavioral Alignment: RT data must be properly aligned with pupil time series
  4. Trial Exclusions: Trials with window out-of-bounds or all-NaN data are excluded

7.2 Quality Control Measures

  • Pre-filtering: Excludes trials with window_oob==1 or all_nan==1
  • Validity Thresholds: Multiple thresholds (40%, 50%, 60%, 70%) allow sensitivity analyses
  • Gate System: Independent gates for different analysis types (baseline, cognitive, overall)

7.3 Recommendations

Based on the data availability:

  1. For Chapter 2:

    • Primary analysis at 60% threshold provides conservative, high-quality data
    • Sensitivity analyses at 50% and 70% thresholds for robustness
    • Consider per-participant inclusion based on minimum trial counts
  2. For Chapter 3:

    • DDM with pupil predictors: Use 50% threshold to maximize sample size while maintaining data quality
    • Behavior-only DDM: Can use all behavioral trials, providing largest possible sample
    • Consider running both analyses to compare results with and without pupil predictors
  3. General:

    • Monitor window validity distributions to identify systematic issues
    • Consider task-specific thresholds if validity differs substantially between ADT and VDT
    • Document all exclusion criteria and thresholds in methods sections

8 Participant-Level Data Quality Supplement

This supplement provides detailed visualizations and diagnostics to distinguish between low AUC values due to poor data quality versus genuine low pupil dilation responses.

Key Concern Addressed: Low AUC values could result from either (1) poor data quality (many missing samples, large gaps during critical periods) or (2) genuine low pupil dilation responses. This supplement implements best-practice diagnostics to identify which is the case.

Important Note on AUC Interpretation: Raw AUC values are mechanically tied to response time (RT) because the cognitive AUC window extends from 4.65s to (4.7s + RT). Shorter RT automatically means less time to accumulate area, even if pupil response amplitude is identical. Therefore, we compute RT-normalized metrics (cog_mean = cog_auc / window_duration) to separate amplitude effects from duration effects.
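To make the normalization concrete, here is a minimal sketch. The window boundaries (4.65s and 4.70s relative to squeeze onset) are taken from the report; the function name is illustrative:

```python
def rt_normalized_auc(cog_auc, rt, win_start=4.65, resp_start=4.70):
    """cog_mean = cog_auc / window_duration.

    The cognitive AUC window runs from win_start to (resp_start + rt),
    both relative to squeeze onset, so its duration grows with RT.
    """
    window_duration = (resp_start + rt) - win_start
    return cog_auc / window_duration

# Two trials with the same mean amplitude but different RTs: the raw AUCs
# differ (0.55 vs 2.05), while the RT-normalized values coincide (~1.0).
fast = rt_normalized_auc(cog_auc=0.55, rt=0.5)   # window ≈ 0.55 s
slow = rt_normalized_auc(cog_auc=2.05, rt=2.0)   # window ≈ 2.05 s
```

This is the duration confound in miniature: comparing raw cog_auc across RTs conflates amplitude with accumulation time, whereas cog_mean does not.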

8.1 RT-Normalized Metrics and Quality Diagnostics


**Key Diagnostic Interpretation:**
- **Points in upper-right quadrant** (high quality, high dilation): Normal responses with good data
- **Points in lower-right quadrant** (high quality, low dilation): Genuine low dilation responses
- **Points in lower-left quadrant** (low quality, low dilation): Likely data quality artifacts
- **Slope of regression line**: If positive, suggests quality-AUC relationship (data quality issue)
- **If slope is flat**: Quality and AUC are independent (low AUC is physiological, not artifactual)
**Correlation between Cognitive Quality and RT-Normalized AUC:**

Quality-AUC Relationship

| Task | Correlation | N Trials |
|------|-------------|----------|
| ADT  | 0.051       | 3555     |
| VDT  | 0.107       | 3811     |
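The values in this table are plain Pearson coefficients between cog_quality and the RT-normalized AUC. A self-contained sketch with invented values (pearson_r is a hypothetical helper, not a pipeline function):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative trial-level values, not real data: cognitive-window validity
# and RT-normalized cognitive AUC (cog_mean). A near-zero r supports the
# "physiological, not artifactual" reading; a clearly positive r flags a
# quality-AUC dependence.
cog_quality = [0.95, 0.80, 0.62, 0.55, 0.90, 0.70]
cog_mean    = [0.21, 0.30, 0.18, 0.25, 0.22, 0.27]
r = pearson_r(cog_quality, cog_mean)
```

The observed r of ~0.05 (ADT) and ~0.11 (VDT) sits close to the flat-slope case described above.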

8.2 Participant-Task Combinations (Using RT-Normalized Metrics)

**Total participant-task combinations:** 116 



**Detailed Examination: Top 5 Most Problematic Cases + 2 Good Exemplars**
**Note:** Full participant-level plots for all combinations are available in the complete supplement.


#### BAP158 - ADT

**No AUC data available for this participant-task combination.**
This may indicate data quality issues.



#### BAP176 - VDT

**No AUC data available for this participant-task combination.**
This may indicate data quality issues.



#### BAP176 - ADT

**No AUC data available for this participant-task combination.**
This may indicate data quality issues.



#### BAP168 - ADT

**No AUC data available for this participant-task combination.**
This may indicate data quality issues.



#### BAP173 - ADT

**Data Quality Summary:**
- Total Trials: 150 
- Trials with Total AUC: 2 
- Trials with Cognitive AUC: 2 
- Mean Baseline Quality: 0.041 
- Mean Cognitive Quality: 0.008 
- Mean Total AUC: -0.57 
- Mean Cognitive AUC (raw): 0.01 
- Mean RT-Normalized Cognitive AUC: 0.2 
- Data Quality Label: Low Quality 



#### BAP186 - VDT

**Data Quality Summary:**
- Total Trials: 150 
- Trials with Total AUC: 150 
- Trials with Cognitive AUC: 114 
- Mean Baseline Quality: 0.861 
- Mean Cognitive Quality: 0.971 
- Mean Total AUC: -1.33 
- Mean Cognitive AUC (raw): 0.01 
- Mean RT-Normalized Cognitive AUC: 0.229 
- Data Quality Label: High Quality 



#### BAP106 - VDT

**Data Quality Summary:**
- Total Trials: 150 
- Trials with Total AUC: 145 
- Trials with Cognitive AUC: 111 
- Mean Baseline Quality: 0.756 
- Mean Cognitive Quality: 0.866 
- Mean Total AUC: -0.3 
- Mean Cognitive AUC (raw): 0.01 
- Mean RT-Normalized Cognitive AUC: 0.257 
- Data Quality Label: High Quality 



**Note:** This section shows the most critical cases for advisor review. Full participant-level plots for all 116 combinations are available in a separate supplement document or can be generated on request.

8.3 Interpreting Participant-Level Plots

Key Diagnostic Framework:

  1. Use RT-Normalized Metrics: Always interpret cog_mean (RT-normalized cognitive AUC) rather than raw cog_auc, because raw AUC is mechanically tied to RT duration.

  2. Low RT-Normalized AUC + High Quality = Genuine Low Dilation

    • If cog_mean is low but cog_quality is high (≥0.6), this indicates genuine low pupil dilation responses
    • These trials/participants should be included in analyses (they represent valid physiological data)
  3. Low RT-Normalized AUC + Low Quality = Data Quality Artifact

    • If cog_mean is low and cog_quality is low (<0.5), this likely reflects missing data during critical periods
    • These trials should be excluded or flagged for further investigation
  4. Use the Diagnostic Scatter Plot (Section 8.1)

    • Points in lower-right quadrant (high quality, low dilation): Genuine low arousal - KEEP
    • Points in lower-left quadrant (low quality, low dilation): Data quality artifact - EXCLUDE
    • Flat regression slope: Quality and AUC are independent (good sign - low AUC is physiological)
    • Positive regression slope: Suggests quality-AUC relationship (data quality issue)

Additional Considerations:

  • Gap-Based Quality (recommended for future enhancement): Ideally, we would also compute max_gap_ms (largest contiguous missing segment) in the cognitive window. Gaps >250-400ms during the peak response period can severely underestimate AUC even when percent-valid looks acceptable. This requires sample-level data access and can be computed using scripts in 02_pupillometry_analysis/quality_control/analyze_prestim_gaps.R as a template. Best-practice preprocessing papers recommend not interpolating over gaps >250ms and rejecting sections with too much missing data after short-gap reconstruction.
  • Baseline Quality: Low baseline quality can distort baseline correction, affecting cognitive AUC. Consider excluding trials with baseline_quality < 0.5.
  • Waveform Plots for Archetypes: For a complete diagnostic, consider generating waveform plots for 4 archetypes: (1) Low AUC + High Quality, (2) Low AUC + Low Quality, (3) Normal AUC + High Quality, (4) High AUC + Moderate Quality. This would require processing sample-level data from flat files.
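As a concrete illustration of why percent-valid alone can mislead, here is a hedged sketch of a max_gap_ms computation. It assumes a regularly sampled trace at 250 Hz with missing samples coded as NaN; the real pipeline would adapt the R template named above rather than use this function:

```python
import math

def max_gap_ms(samples, dt_ms=4.0):
    """Longest contiguous run of missing (NaN) samples, in milliseconds.

    At the study's 250 Hz sampling rate, each sample spans 4 ms.
    """
    longest = run = 0
    for s in samples:
        run = run + 1 if math.isnan(s) else 0
        longest = max(longest, run)
    return longest * dt_ms

# 80% of these samples are valid, yet every missing sample falls in a single
# 400 ms gap, long enough to distort the cognitive AUC despite the
# acceptable-looking percent-valid figure.
trace = [3.0] * 200 + [float("nan")] * 100 + [3.0] * 200
max_gap_ms(trace)  # → 400.0
```

A trial like this would pass a 50% or 60% validity gate while still warranting exclusion under a gap-based criterion.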

Recommendations:

  • High Quality + Low RT-Normalized AUC: Include in analyses (genuine low dilation)
  • Low Quality + Low RT-Normalized AUC: Exclude (data quality artifact)
  • Mixed Quality: Use quality thresholds (50% for Chapter 3, 60% for Chapter 2) to filter trials

9 Conclusion and Next Steps

9.1 Data Readiness Summary

  • Chapter 2: 6,104 trials ready for primary analysis (60% threshold)
  • Chapter 3: 7,450 trials ready for DDM with pupil predictors (50% threshold + RT filter)
  • Behavior-Only: 12,715 trials available for behavior-only DDM analyses
  • AUC Availability: 50.5% of trials have both Total AUC and Cognitive AUC

9.2 Data Files

All detailed data files are available in:

  • quick_share_v7/qc/ - Quality control summaries and gate pass rates
  • quick_share_v7/analysis_ready/ - Analysis-ready datasets:
    • ch2_triallevel.csv - Chapter 2 ready data (14,586 trials)
    • ch3_triallevel.csv - Chapter 3 ready data (14,586 trials)
  • quick_share_v7/merged/ - Full merged trial-level dataset


Report Generated: January 03, 2026 at 04:03 PM

Data Source: BAP Pupillometry Analysis Pipeline
For Questions: Please refer to the pipeline documentation in 02_pupillometry_analysis/README.md

References

Fink, Andreas. 2024. “Best Practices for Preprocessing and Analysis of Pupillometry Data.” Psychophysiology 61 (3): e14478. https://doi.org/10.1111/psyp.14478.
Gee, Jan Willem de, Konstantinos Tsetsos, Lars Schwabe, Anne E. Urai, and Tobias H. Donner. 2020. “Pupil-Linked Phasic Arousal Predicts a Reduction of Choice Bias Across Species and Decision Domains.” eLife 9: e54014. https://doi.org/10.7554/eLife.54014.
Kolnes, Maren et al. 2024. “Broadening of Attention Dilates the Pupil.” Attention, Perception, & Psychophysics.
Kret, Mariska E., and Elio E. Sjak-Shie. 2019. “Preprocessing Pupil Size Data: Guidelines and Code.” Behavior Research Methods 51 (3): 1336–42. https://doi.org/10.3758/s13428-018-1075-y.
Mathôt, Sebastiaan. 2018. “Pupillometry: Psychology, Physiology, and Function.” Journal of Cognition 1 (1): 16. https://doi.org/10.5334/joc.18.
Murphy, Peter R., Redmond G. O’Connell, Michael O’Sullivan, Ian H. Robertson, and Joshua H. Balsters. 2014. “Pupil Diameter Covaries with BOLD Activity in Human Locus Coeruleus.” Human Brain Mapping 35 (8): 4140–54. https://doi.org/10.1002/hbm.22466.
Steinhauer, Stuart R., Margaret M. Bradley, Greg J. Siegle, Kathryn A. Roecklein, and Annemarie Dix. 2022. “Publication Guidelines and Recommendations for Pupillary Measurement in Psychophysiological Studies.” Psychophysiology 59 (4): e14035. https://doi.org/10.1111/psyp.14035.